SEQUENCE LISTING



(1) GENERAL INFORMATION : 

(i) APPLICANT: Bujard, Hermann
Gossen, Manfred
Salfeld, Jochen G.
Voss, Jeffrey W.

(ii) TITLE OF INVENTION: Animals Transgenic for a Tetracycline-Controlled Transcriptional Transactivator

(iii) NUMBER OF SEQUENCES: 10

(iv) CORRESPONDENCE ADDRESS:

(A) ADDRESSEE: Lahive & Cockfield 

(B) STREET: 60 State Street 
(C) CITY: Boston

(D) STATE: Massachusetts 

(E) COUNTRY: USA 

(F) ZIP: 02109-1875 

(v) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Floppy disk 

(B) COMPUTER: IBM PC compatible 

(C) OPERATING SYSTEM: PC-DOS/MS-DOS

(D) SOFTWARE: ASCII text 

(vi) CURRENT APPLICATION DATA: 

(A) APPLICATION NUMBER: 

(B) FILING DATE: 

(C) CLASSIFICATION: 

(vii) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: 08/383,754 

(B) FILING DATE: 14-JUN-1994

(vii) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: 08/076,327 

(B) FILING DATE: 14-JUN-1993

(viii) ATTORNEY/AGENT INFORMATION:

(A) NAME: DeConti, Giulio A., Jr.

(B) REGISTRATION NUMBER: 31,503 

(C) REFERENCE/DOCKET NUMBER: BBI-013CP2



(ix) TELECOMMUNICATION INFORMATION: 

(A) TELEPHONE: (617) 227-7400 

(B) TELEFAX: (617) 227-5941 



(2) INFORMATION FOR SEQ ID NO: 1:



(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1008 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Herpes Simplex Virus

(B) STRAIN: K12, KOS

(vii) IMMEDIATE SOURCE:

(B) CLONE: tTA transactivator

(ix) FEATURE:

(A) NAME/KEY: exon

(B) LOCATION: 1..1008

(ix) FEATURE:

(A) NAME/KEY: mRNA

(B) LOCATION: 1..1008

(ix) FEATURE:

(A) NAME/KEY: misc. binding

(B) LOCATION: 1..207

(ix) FEATURE:

(A) NAME/KEY: misc. binding

(B) LOCATION: 208..335

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..1005

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:

ATG TCT AGA TTA GAT AAA AGT AAA GTG ATT AAC AGC GCA TTA GAG CTG 48
Met Ser Arg Leu Asp Lys Ser Lys Val Ile Asn Ser Ala Leu Glu Leu
1 5 10 15

CTT AAT GAG GTC GGA ATC GAA GGT TTA ACA ACC CGT AAA CTC GCC CAG 96
Leu Asn Glu Val Gly Ile Glu Gly Leu Thr Thr Arg Lys Leu Ala Gln
20 25 30

AAG CTA GGT GTA GAG CAG CCT ACA TTG TAT TGG CAT GTA AAA AAT AAG 144
Lys Leu Gly Val Glu Gln Pro Thr Leu Tyr Trp His Val Lys Asn Lys
35 40 45

CGG GCT TTG CTC GAC GCC TTA GCC ATT GAG ATG TTA GAT AGG CAC CAT 192
Arg Ala Leu Leu Asp Ala Leu Ala Ile Glu Met Leu Asp Arg His His
50 55 60

ACT CAC TTT TGC CCT TTA GAA GGG GAA AGC TGG CAA GAT TTT TTA CGT 240
Thr His Phe Cys Pro Leu Glu Gly Glu Ser Trp Gln Asp Phe Leu Arg
65 70 75 80

AAT AAG GCT AAA AGT TTT AGA TGT GCT TTA CTA AGT CAT CGC GAT GGA 288
Asn Lys Ala Lys Ser Phe Arg Cys Ala Leu Leu Ser His Arg Asp Gly
85 90 95

GCA AAA GTA CAT TTA GGT ACA CGG CCT ACA GAA AAA CAG TAT GAA ACT 336
Ala Lys Val His Leu Gly Thr Arg Pro Thr Glu Lys Gln Tyr Glu Thr
100 105 110

CTC GAA AAT CAA TTA GCC TTT TTA TGC CAA CAA GGT TTT TCA CTA GAG 384
Leu Glu Asn Gln Leu Ala Phe Leu Cys Gln Gln Gly Phe Ser Leu Glu
115 120 125

AAT GCA TTA TAT GCA CTC AGC GCT GTG GGG CAT TTT ACT TTA GGT TGC 432
Asn Ala Leu Tyr Ala Leu Ser Ala Val Gly His Phe Thr Leu Gly Cys
130 135 140

GTA TTG GAA GAT CAA GAG CAT CAA GTC GCT AAA GAA GAA AGG GAA ACA 480
Val Leu Glu Asp Gln Glu His Gln Val Ala Lys Glu Glu Arg Glu Thr
145 150 155 160

CCT ACT ACT GAT AGT ATG CCG CCA TTA TTA CGA CAA GCT ATC GAA TTA 528
Pro Thr Thr Asp Ser Met Pro Pro Leu Leu Arg Gln Ala Ile Glu Leu
165 170 175

TTT GAT CAC CAA GGT GCA GAG CCA GCC TTC TTA TTC GGC CTT GAA TTG 576
Phe Asp His Gln Gly Ala Glu Pro Ala Phe Leu Phe Gly Leu Glu Leu
180 185 190

ATC ATA TGC GGA TTA GAA AAA CAA CTT AAA TGT GAA AGT GGG TCC GCG 624
Ile Ile Cys Gly Leu Glu Lys Gln Leu Lys Cys Glu Ser Gly Ser Ala
195 200 205

TAC AGC CGC GCG CGT ACG AAA AAC AAT TAC GGG TCT ACC ATC GAG GGC 672
Tyr Ser Arg Ala Arg Thr Lys Asn Asn Tyr Gly Ser Thr Ile Glu Gly
210 215 220

CTG CTC GAT CTC CCG GAC GAC GAC GCC CCC GAA GAG GCG GGG CTG GCG 720
Leu Leu Asp Leu Pro Asp Asp Asp Ala Pro Glu Glu Ala Gly Leu Ala
225 230 235 240

GCT CCG CGC CTG TCC TTT CTC CCC GCG GGA CAC ACG CGC AGA CTG TCG 768
Ala Pro Arg Leu Ser Phe Leu Pro Ala Gly His Thr Arg Arg Leu Ser
245 250 255

ACG GCC CCC CCG ACC GAT GTC AGC CTG GGG GAC GAG CTC CAC TTA GAC 816
Thr Ala Pro Pro Thr Asp Val Ser Leu Gly Asp Glu Leu His Leu Asp
260 265 270

GGC GAG GAC GTG GCG ATG GCG CAT GCC GAC GCG CTA GAC GAT TTC GAT 864
Gly Glu Asp Val Ala Met Ala His Ala Asp Ala Leu Asp Asp Phe Asp
275 280 285

CTG GAC ATG TTG GGG GAC GGG GAT TCC CCG GGT CCG GGA TTT ACC CCC 912
Leu Asp Met Leu Gly Asp Gly Asp Ser Pro Gly Pro Gly Phe Thr Pro
290 295 300

CAC GAC TCC GCC CCC TAC GGC GCT CTG GAT ATG GCC GAC TTC GAG TTT 960
His Asp Ser Ala Pro Tyr Gly Ala Leu Asp Met Ala Asp Phe Glu Phe
305 310 315 320

GAG CAG ATG TTT ACC GAT CCC CTT GGA ATT GAC GAG TAC GGT GGG TAG 1008
Glu Gln Met Phe Thr Asp Pro Leu Gly Ile Asp Glu Tyr Gly Gly
325 330 335

(2) INFORMATION FOR SEQ ID NO: 2:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 335 amino acids

(B) TYPE: amino acid

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:

Met Ser Arg Leu Asp Lys Ser Lys Val Ile Asn Ser Ala Leu Glu Leu
1 5 10 15

Leu Asn Glu Val Gly Ile Glu Gly Leu Thr Thr Arg Lys Leu Ala Gln
20 25 30

Lys Leu Gly Val Glu Gln Pro Thr Leu Tyr Trp His Val Lys Asn Lys
35 40 45

Arg Ala Leu Leu Asp Ala Leu Ala Ile Glu Met Leu Asp Arg His His
50 55 60

Thr His Phe Cys Pro Leu Glu Gly Glu Ser Trp Gln Asp Phe Leu Arg
65 70 75 80

Asn Lys Ala Lys Ser Phe Arg Cys Ala Leu Leu Ser His Arg Asp Gly
85 90 95

Ala Lys Val His Leu Gly Thr Arg Pro Thr Glu Lys Gln Tyr Glu Thr
100 105 110

Leu Glu Asn Gln Leu Ala Phe Leu Cys Gln Gln Gly Phe Ser Leu Glu
115 120 125

Asn Ala Leu Tyr Ala Leu Ser Ala Val Gly His Phe Thr Leu Gly Cys
130 135 140

Val Leu Glu Asp Gln Glu His Gln Val Ala Lys Glu Glu Arg Glu Thr
145 150 155 160

Pro Thr Thr Asp Ser Met Pro Pro Leu Leu Arg Gln Ala Ile Glu Leu
165 170 175

Phe Asp His Gln Gly Ala Glu Pro Ala Phe Leu Phe Gly Leu Glu Leu
180 185 190

Ile Ile Cys Gly Leu Glu Lys Gln Leu Lys Cys Glu Ser Gly Ser Ala
195 200 205

Tyr Ser Arg Ala Arg Thr Lys Asn Asn Tyr Gly Ser Thr Ile Glu Gly
210 215 220

Leu Leu Asp Leu Pro Asp Asp Asp Ala Pro Glu Glu Ala Gly Leu Ala
225 230 235 240

Ala Pro Arg Leu Ser Phe Leu Pro Ala Gly His Thr Arg Arg Leu Ser
245 250 255

Thr Ala Pro Pro Thr Asp Val Ser Leu Gly Asp Glu Leu His Leu Asp
260 265 270

Gly Glu Asp Val Ala Met Ala His Ala Asp Ala Leu Asp Asp Phe Asp
275 280 285

Leu Asp Met Leu Gly Asp Gly Asp Ser Pro Gly Pro Gly Phe Thr Pro
290 295 300

His Asp Ser Ala Pro Tyr Gly Ala Leu Asp Met Ala Asp Phe Glu Phe
305 310 315 320

Glu Gln Met Phe Thr Asp Pro Leu Gly Ile Asp Glu Tyr Gly Gly
325 330 335



(2) INFORMATION FOR SEQ ID NO: 3:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 894 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Herpes Simplex Virus

(B) STRAIN: K12, KOS

(C) INDIVIDUAL ISOLATE: tTAs transactivator

(ix) FEATURE:

(A) NAME/KEY: exon

(B) LOCATION: 1..894

(ix) FEATURE:

(A) NAME/KEY: mRNA

(B) LOCATION: 1..894

(ix) FEATURE:

(A) NAME/KEY: misc. binding

(B) LOCATION: 1..207

(ix) FEATURE:

(A) NAME/KEY: misc. binding

(B) LOCATION: 208..297

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..891

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:



ATG TCT AGA TTA GAT AAA AGT AAA GTG ATT AAC AGC GCA TTA GAG CTG 48
Met Ser Arg Leu Asp Lys Ser Lys Val Ile Asn Ser Ala Leu Glu Leu
1 5 10 15

CTT AAT GAG GTC GGA ATC GAA GGT TTA ACA ACC CGT AAA CTC GCC CAG 96
Leu Asn Glu Val Gly Ile Glu Gly Leu Thr Thr Arg Lys Leu Ala Gln
20 25 30

AAG CTA GGT GTA GAG CAG CCT ACA TTG TAT TGG CAT GTA AAA AAT AAG 144
Lys Leu Gly Val Glu Gln Pro Thr Leu Tyr Trp His Val Lys Asn Lys
35 40 45

CGG GCT TTG CTC GAC GCC TTA GCC ATT GAG ATG TTA GAT AGG CAC CAT 192
Arg Ala Leu Leu Asp Ala Leu Ala Ile Glu Met Leu Asp Arg His His
50 55 60

ACT CAC TTT TGC CCT TTA GAA GGG GAA AGC TGG CAA GAT TTT TTA CGT 240
Thr His Phe Cys Pro Leu Glu Gly Glu Ser Trp Gln Asp Phe Leu Arg
65 70 75 80

AAT AAC GCT AAA AGT TTT AGA TGT GCT TTA CTA AGT CAT CGC GAT GGA 288
Asn Asn Ala Lys Ser Phe Arg Cys Ala Leu Leu Ser His Arg Asp Gly
85 90 95

GCA AAA GTA CAT TTA GGT ACA CGG CCT ACA GAA AAA CAG TAT GAA ACT 336
Ala Lys Val His Leu Gly Thr Arg Pro Thr Glu Lys Gln Tyr Glu Thr
100 105 110

CTC GAA AAT CAA TTA GCC TTT TTA TGC CAA CAA GGT TTT TCA CTA GAG 384
Leu Glu Asn Gln Leu Ala Phe Leu Cys Gln Gln Gly Phe Ser Leu Glu
115 120 125

AAT GCA TTA TAT GCA CTC AGC GCT GTG GGG CAT TTT ACT TTA GGT TGC 432
Asn Ala Leu Tyr Ala Leu Ser Ala Val Gly His Phe Thr Leu Gly Cys
130 135 140

GTA TTG GAA GAT CAA GAG CAT CAA GTC GCT AAA GAA GAA AGG GAA ACA 480
Val Leu Glu Asp Gln Glu His Gln Val Ala Lys Glu Glu Arg Glu Thr
145 150 155 160

CCT ACT ACT GAT AGT ATG CCG CCA TTA TTA CGA CAA GCT ATC GAA TTA 528
Pro Thr Thr Asp Ser Met Pro Pro Leu Leu Arg Gln Ala Ile Glu Leu
165 170 175

TTT GAT CAC CAA GGT GCA GAG CCA GCC TTC TTA TTC GGC CTT GAA TTG 576
Phe Asp His Gln Gly Ala Glu Pro Ala Phe Leu Phe Gly Leu Glu Leu
180 185 190

ATC ATA TGC GGA TTA GAA AAA CAA CTT AAA TGT GAA AGT GGG TCT GAT 624
Ile Ile Cys Gly Leu Glu Lys Gln Leu Lys Cys Glu Ser Gly Ser Asp
195 200 205

CCA TCG ATA CAC ACG CGC AGA CTG TCG ACG GCC CCC CCG ACC GAT GTC 672
Pro Ser Ile His Thr Arg Arg Leu Ser Thr Ala Pro Pro Thr Asp Val
210 215 220

AGC CTG GGG GAC GAG CTC CAC TTA GAC GGC GAG GAC GTG GCG ATG GCG 720
Ser Leu Gly Asp Glu Leu His Leu Asp Gly Glu Asp Val Ala Met Ala
225 230 235 240

CAT GCC GAC GCG CTA GAC GAT TTC GAT CTG GAC ATG TTG GGG GAC GGG 768
His Ala Asp Ala Leu Asp Asp Phe Asp Leu Asp Met Leu Gly Asp Gly
245 250 255

GAT TCC CCG GGT CCG GGA TTT ACC CCC CAC GAC TCC GCC CCC TAC GGC 816
Asp Ser Pro Gly Pro Gly Phe Thr Pro His Asp Ser Ala Pro Tyr Gly
260 265 270

GCT CTG GAT ATG GCC GAC TTC GAG TTT GAG CAG ATG TTT ACC GAT GCC 864
Ala Leu Asp Met Ala Asp Phe Glu Phe Glu Gln Met Phe Thr Asp Ala
275 280 285

CTT GGA ATT GAC GAG TAC GGT GGG TTC TAG 894
Leu Gly Ile Asp Glu Tyr Gly Gly Phe
290 295



(2) INFORMATION FOR SEQ ID NO: 4:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 297 amino acids

(B) TYPE: amino acid

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

Met Ser Arg Leu Asp Lys Ser Lys Val Ile Asn Ser Ala Leu Glu Leu
1 5 10 15

Leu Asn Glu Val Gly Ile Glu Gly Leu Thr Thr Arg Lys Leu Ala Gln
20 25 30

Lys Leu Gly Val Glu Gln Pro Thr Leu Tyr Trp His Val Lys Asn Lys
35 40 45

Arg Ala Leu Leu Asp Ala Leu Ala Ile Glu Met Leu Asp Arg His His
50 55 60

Thr His Phe Cys Pro Leu Glu Gly Glu Ser Trp Gln Asp Phe Leu Arg
65 70 75 80

Asn Asn Ala Lys Ser Phe Arg Cys Ala Leu Leu Ser His Arg Asp Gly
85 90 95

Ala Lys Val His Leu Gly Thr Arg Pro Thr Glu Lys Gln Tyr Glu Thr
100 105 110

Leu Glu Asn Gln Leu Ala Phe Leu Cys Gln Gln Gly Phe Ser Leu Glu
115 120 125

Asn Ala Leu Tyr Ala Leu Ser Ala Val Gly His Phe Thr Leu Gly Cys
130 135 140

Val Leu Glu Asp Gln Glu His Gln Val Ala Lys Glu Glu Arg Glu Thr
145 150 155 160

Pro Thr Thr Asp Ser Met Pro Pro Leu Leu Arg Gln Ala Ile Glu Leu
165 170 175

Phe Asp His Gln Gly Ala Glu Pro Ala Phe Leu Phe Gly Leu Glu Leu
180 185 190

Ile Ile Cys Gly Leu Glu Lys Gln Leu Lys Cys Glu Ser Gly Ser Asp
195 200 205

Pro Ser Ile His Thr Arg Arg Leu Ser Thr Ala Pro Pro Thr Asp Val
210 215 220

Ser Leu Gly Asp Glu Leu His Leu Asp Gly Glu Asp Val Ala Met Ala
225 230 235 240

His Ala Asp Ala Leu Asp Asp Phe Asp Leu Asp Met Leu Gly Asp Gly
245 250 255

Asp Ser Pro Gly Pro Gly Phe Thr Pro His Asp Ser Ala Pro Tyr Gly
260 265 270

Ala Leu Asp Met Ala Asp Phe Glu Phe Glu Gln Met Phe Thr Asp Ala
275 280 285

Leu Gly Ile Asp Glu Tyr Gly Gly Phe
290 295

(2) INFORMATION FOR SEQ ID NO: 5:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 450 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Human cytomegalovirus

(B) STRAIN: K12, Towne

(ix) FEATURE:

(A) NAME/KEY: mRNA

(B) LOCATION: 382..450

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:

GAATTCCTCG AGTTTACCAC TCCCTATCAG TGATAGAGAA AAGTGAAAGT CGAGTTTACC 60
ACTCCCTATC AGTGATAGAG AAAAGTGAAA GTCGAGTTTA CCACTCCCTA TCAGTGATAG 120
AGAAAAGTGA AAGTCGAGTT TACCACTCCC TATCAGTGAT AGAGAAAAGT GAAAGTCGAG 180
TTTACCACTC CCTATCAGTG ATAGAGAAAA GTGAAAGTCG AGTTTACCAC TCCCTATCAG 240
TGATAGAGAA AAGTGAAAGT CGAGTTTACC ACTCCCTATC AGTGATAGAG AAAAGTGAAA 300
GTCGAGCTCG GTACCCGGGT CGAGTAGGCG TGTACGGTGG GAGGCCTATA TAAGCAGAGC 360
TCGTTTAGTG AACCGTCAGA TCGCCTGGAG ACGCCATCCA CGCTGTTTTG ACCTCCATAG 420
AAGACACCGG GACCGATCCA GCCTCCGCGG 450
(2) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 450 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Human cytomegalovirus 

(B) STRAIN: Towne 

(ix) FEATURE: 

(A) NAME/KEY: mRNA

(B) LOCATION: 382..450

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:
GAATTCCTCG ACCCGGGTAC CGAGCTCGAC TTTCACTTTT CTCTATCACT GATAGGGAGT 
GGTAAACTCG ACTTTCACTT TTCTCTATCA CTGATAGGGA GTGGTAAACT CGACTTTCAC 
TTTTCTCTAT CACTGATAGG GAGTGGTAAA CTCGACTTTC ACTTTTCTCT ATCACTGATA 
GGGAGTGGTA AACTCGACTT TCACTTTTCT CTATCACTGA TAGGGAGTGG TAAACTCGAC 
TTTCACTTTT CTCTATCACT GATAGGGAGT GGTAAACTCG ACTTTCACTT TTCTCTATCA 
CTGATAGGGA GTGGTAAACT CGAGTAGGCG TGTACGGTGG GAGGCCTATA TAAGCAGAGC 
TCGTTTAGTG AACCGTCAGA TCGCCTGGAG ACGCCATCCA CGCTGTTTTG ACCTCCATAG 
AAGACACCGG GACCGATCCA GCCTCCGCGG 
(2) INFORMATION FOR SEQ ID NO: 7:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 398 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic)

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Herpes Simplex Virus 

(B) STRAIN: KOS 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:
GAGCTCGACT TTCACTTTTC TCTATCACTG ATAGGGAGTG GTAAACTCGA CTTTCACTTT
TCTCTATCAC TGATAGGGAG TGGTAAACTC GACTTTCACT TTTCTCTATC ACTGATAGGG
AGTGGTAAAC TCGACTTTCA CTTTTCTCTA TCACTGATAG GGAGTGGTAA ACTCGACTTT 
CACTTTTCTC TATCACTGAT AGGGAGTGGT AAACTCGACT TTCACTTTTC TCTATCACTG 
ATAGGGAGTG GTAAACTCGA CTTTCACTTT TCTCTATCAC TGATAGGGAG TGGTAAACTC 
GAGATCCGGC GAATTCGAAC ACGCAGATGC AGTCGGGGCG GCGCGGTCCG AGGTCCACTT 
CGCATATTAA GGTGACGCGT GTGGCCTCGA ACACCGAG 
(2) INFORMATION FOR SEQ ID NO: 8:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 6244 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: circular

(ii) MOLECULE TYPE: DNA (genomic)

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Human cytomegalovirus

(B) STRAIN: Towne (hCMV)

(vii) IMMEDIATE SOURCE:

(B) CLONE: pUHD BGR3

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:

CTCGAGTTTA CCACTCCCTA TCAGTGATAG AGAAAAGTGA AAGTCGAGTT TACCACTCCC 60
TATCAGTGAT AGAGAAAAGT GAAAGTCGAG TTTACCACTC CCTATCAGTG ATAGAGAAAA 120
GTGAAAGTCG AGTTTACCAC TCCCTATCAG TGATAGAGAA AAGTGAAAGT CGAGTTTACC 180
ACTCCCTATC AGTGATAGAG AAAAGTGAAA GTCGAGTTTA CCACTCCCTA TCAGTGATAG 240
AGAAAAGTGA AAGTCGAGTT TACCACTCCC TATCAGTGAT AGAGAAAAGT GAAAGTCGAG 300
CTCGGTACCC GGGTCGAGTA GGCGTGTACG GTGGGAGGCC TATATAAGCA GAGCTCGTTT 360
AGTGAACCGT CAGATCGCCT GGAGACGCCA TCCACGCTGT TTTGACCTCC ATAGAAGACA 420
CCGGGACCGA TCCAGCCTCC GCGGCCCCGA ATTCGAGCTC GGTACCGGGC CCCCCCTCGA 480
GGTCGACGGT ATCGATAAGC TTGATATCGA ATTCCAGGAG GTGGAGATCC GCGGGTCCAG 540
CCAAACCCCA CACCCATTTT CTCCTCCCTC TGCCCCTATA TCCCGGCACC CCCTCCTCCT 600
AGCCCTTTCC CTCCTCCCGA GAGACGGGGG AGGAGAAAAG GGGAGTTCAG GTCGACATGA 660
CTGAGCTGAA GGCAAAGGAA CCTCGGGCTC CCCACGTGGC GGGCGGCGCG CCCTCCCCCA 720
CCGAGGTCGG ATCCCAGCTC CTGGGTCGCC CGGACCCTGG CCCCTTCCAG GGGAGCCAGA 780
CCTCAGAGGC CTCGTCTGTA GTCTCCGCCA TCCCCATCTC CCTGGACGGG TTGCTCTTCC 840
CCCGGCCCTG TCAGGGGCAG AACCCCCCAG ACGGGAAGAC GCAGGACCCA CCGTCGTTGT 900
CAGACGTGGA GGGCGCATTT CCTGGAGTCG AAGCCCCGGA GGGGGCAGGA GACAGCAGCT 960
CGAGACCTCC AGAAAAGGAC AGCGGCCTGC TGGACAGTGT CCTCGACACG CTCCTGGCGC 1020
CCTCGGGTCC CGGGCAGAGC CACGCCAGCC CTGCCACCTG CGAGGCCATC AGCCCGTGGT 1080
GCCTGTTTGG CCCCGACCTT CCCGAAGACC CCCGGGCTGC CCCCGCTACC AAAGGGGTGT 1140
TGGCCCCGCT CATGAGCCGA CCCGAGGACA AGGCAGGCGA CAGCTCTGGG ACGGCAGCGG 1200
CCCACAAGGT GCTGCCCAGG GGACTGTCAC CATCCAGGCA GCTGCTGCTC CCCTCCTCTG 1260
GGAGCCCTCA CTGGCCGGCA GTGAAGCCAT CCCCGCAGCC CGCTGCGGTG CAGGTAGACG 1320
AGGAGGACAG CTCCGAATCC GAGGGCACCG TGGGCCCGCT CCTGAAGGGC CAACCTCGGG 1380
CACTGGGAGG CACGGCGGCC GGAGGAGGAG CTGCCCCCGT CGCGTCTGGA GCGGCCGCAG 1440
GAGGCGTCGC CCTTGTCCCC AAGGAAGATT CTCGCTTCTC GGCGCCCAGG GTCTCCTTGG 1500
CGGAGCAGGA CGCGCCGGTG GCGCCTGGGC GCTCCCCGCT GGCCACCTCG GTGGTGGATT 1560
TCATCCACGT GCCCATCCTG CCTCTCAACC ACGCTTTCCT GGCCACCCGC ACCAGGCAGC 1620
TGCTGGAGGG GGAGAGCTAC GACGGCGGGG CCGCGGCCGC CAGCCCCTTC GTCCCGCAGC 1680
GGGGCTCCCC CTCTGCCTCG TCCACCCCTG TGGCGGGCGG CGACTTCCCC GACTGCACCT 1740
ACCCGCCCGA CGCCGAGCCC AAAGATGACG CGTTCCCCCT CTACGGCGAC TTCCAGCCGC 1800
CCGCCCTCAA GATAAAGGAG GAGGAAGAAG CCGCCGAGGC CGCGGCGCGC TCCCCGCGTA 1860
CGTACCTGGT GGCTGGTGCA AACCCCGCCG CCTTCCCGGA CTTCCAGCTG GCAGCGCCGC 1920
CGCCACCCTC GCTGCCGCCT CGAGTGCCCT CGTCCAGACC CGGGGAAGCG GCGGTGGCGG 1980
CCTCCCCAGG CAGTGCCTCC GTCTCCTCCT CGTCCTCGTC GGGGTCGACC CTGGAGTGCA 2040
TCCTGTACAA GGCAGAAGGC GCGCCGCCCC AGCAGGGCCC CTTCGCGCCG CTGCCCTGCA 2100
AGCCTCCGGG CGCCGGCGCC TGCCTGCTCC CGCGGGACGG CCTGCCCTCC ACCTCCGCCT 2160
CGGGCGCAGC CGCCGGGGCC GCCCCTGCGC TCTACCCGAC GCTCGGCCTC AACGGACTCC 2220
CGCAACTCGG CTACCAGGCC GCCGTGCTCA AGGAGGGCCT GCCGCAGGTC TACACGCCCT 2280
ATCTCAACTA CCTGAGGCCG GATTCAGAAG CCAGTCAGAG CCCACAGTAC AGCTTCGAGT 2340
CACTACCTCA GAAGATTTGT TTGATCTGTG GGGATGAAGC ATCAGGCTGT CATTATGGTG 2400
TCCTCACCTG TGGGAGCTGT AAGGTCTTCT TTAAAAGGGC AATGGAAGGG CAGCATAACT 2460
ATTTATGTGC TGGAAGAAAT GACTGCATTG TTGATAAAAT CCGCAGGAAA AACTGCCCGG 2520
CGTGTCGCCT TAGAAAGTGC TGTCAAGCTG GCATGGTCCT TGGAGGGCGA AAGTTTAAAA 2580
AGTTCAATAA AGTCAGAGTC ATGAGAGCAC TCGATGCTGT TGCTCTCCCA CAGCCAGTGG 2640



GCATTCCAAA TGAAAGCGAA CGAATCACTT TTTCTCCAAG TCAAGAGATA CAGTTAATTC 2700
CCCCTCTAAT CAACCTGTTA ATGAGCATTG AACCAGATGT GATCTATGCA GGACATGACA 2760
ACACAAAGCC TGATACCTCC AGTTCTTTGC TGACGAGTCT TAATCAACTA GGCGAGCGGC 2820
AACTTCTTTC AGTGGTAAAA TGGTCCAAAT CTCTTCCAGG TTTTCGAAAC TTACATATTG 2880
ATGACCAGAT AACTCTCATC CAGTATTCTT GGATGAGTTT AATGGTATTT GGACTAGGAT 2940
GGAGATCCTA CAAACATGTC AGTGGGCAGA TGCTGTATTT TGCACCTGAT CTAATATTAA 3000
ATGAACAGCG GATGAAAGAA TCATCATTCT ATTCACTATG CCTTACCATG TGGCAGATAC 3060
CGCAGGAGTT TGTCAAGCTT CAAGTTAGCC AAGAAGAGTT CCTCTGCATG AAAGTATTAC 3120
TACTTCTTAA TACAATTCCT TTGGAAGGAC TAAGAAGTCA AAGCCAGTTT GAAGAGATGA 3180
GATCAAGCTA CATTAGAGAG CTCATCAAGG CAATTGGTTT GAGGCAAAAA GGAGTTGTTT 3240
CCAGCTCACA GCGTTTCTAT CAGCTCACAA AACTTCTTGA TAACTTGCAT GATCTTGTCA 3300
AACAACTTCA CCTGTACTGC CTGAATACAT TTATCCAGTC CCGGGCGCTG AGTGTTGAAT 3360
TTCCAGAAAT GATGTCTGAA GTTATTGCTG CACAGTTACC CAAGATATTG GCAGGGATGG 3420
TGAAACCACT TCTCTTTCAT AAAAAGTGAA TGTCAATTAT TTTTCAAAGA ATTAAGTGTT 3480
GTGGTATGTC TTTCGTTTTG GTCAGGATTA TGACGTCTCG AGTTTTTATA ATATTCTGAA 3540
AGGGAATTCC TGCAGCCCGG GGGATCCACT AGTTCTAGAG GATCCAGACA TGATAAGATA 3600
CATTGATGAG TTTGGACAAA CCACAACTAG AATGCAGTGA AAAAAATGCT TTATTTGTGA 3660
AATTTGTGAT GCTATTGCTT TATTTGTAAC CATTATAAGC TGCAATAAAC AAGTTAACAA 3720
CAACAATTGC ATTCATTTTA TGTTTCAGGT TCAGGGGGAG GTGTGGGAGG TTTTTTAAAG 3780
CAAGTAAAAC CTCTACAAAT GTGGTATGGC TGATTATGAT CCTGCAAGCC TCGTCGTCTG 3840
GCCGGACCAC GCTATCTGTG CAAGGTCCCC GGACGCGCGC TCCATGAGCA GAGCGCCCGC 3900
CGCCGAGGCA AGACTCGGGC GGCGCCCTGC CCGTCCCACC AGGTCAACAG GCGGTAACCG 3960
GCCTCTTCAT CGGGAATGCG CGCGACCTTC AGCATCGCCG GCATGTCCCC TGGCGGACGG 4020
GAAGTATCAG CTCGACCAAG CTTGGCGAGA TTTTCAGGAG CTAAGGAAGC TAAAATGGAG 4080
AAAAAAATCA CTGGATATAC CACCGTTGAT ATATCCCAAT GGCATCGTAA AGAACATTTT 4140
GAGGCATTTC AGTCAGTTGC TCAATGTACC TATAACCAGA CCGTTCAGCT GCATTAATGA 4200
ATCGGCCAAC GCGCGGGGAG AGGCGGTTTG CGTATTGGGC GCTCTTCCGC TTCCTCGCTC 4260
ACTGACTCGC TGCGCTCGGT CGTTCGGCTG CGGCGAGCGG TATCAGCTCA CTCAAAGGCG 4320
GTAATACGGT TATCCACAGA ATCAGGGGAT AACGCAGGAA AGAACATGTG AGCAAAAGGC 4380
CAGCAAAAGG CCAGGAACCG TAAAAAGGCC GCGTTGCTGG CGTTTTTCCA TAGGCTCCGC 4440
CCCCCTGACG AGCATCACAA AAATCGACGC TCAAGTCAGA GGTGGCGAAA CCCGACAGGA 4500

CTATAAAGAT ACCAGGCGTT TCCCCCTGGA AGCTCCCTCG TGCGCTCTCC TGTTCCGACC 4560
CTGCCGCTTA CCGGATACCT GTCCGCCTTT CTCCCTTCGG GAAGCGTGGC GCTTTCTCAA 4620
TGCTCACGCT GTAGGTATCT CAGTTCGGTG TAGGTCGTTC GCTCCAAGCT GGGCTGTGTG 4680
CACGAACCCC CCGTTCAGCC CGACCGCTGC GCCTTATCCG GTAACTATCG TCTTGAGTCC 4740
AACCCGGTAA GACACGACTT ATCGCCACTG GCAGCAGCCA CTGGTAACAG GATTAGCAGA 4800
GCGAGGTATG TAGGCGGTGC TACAGAGTTC TTGAAGTGGT GGCCTAACTA CGGCTACACT 4860
AGAAGGACAG TATTTGGTAT CTGCGCTCTG CTGAAGCCAG TTACCTTCGG AAAAAGAGTT 4920
GGTAGCTCTT GATCCGGCAA ACAAACCACC GCTGGTAGCG GTGGTTTTTT TGTTTGCAAG 4980
CAGCAGATTA CGCGCAGAAA AAAAGGATCT CAAGAAGATC CTTTGATCTT TTCTACGGGG 5040
TCTGACGCTC AGTGGAACGA AAACTCACGT TAAGGGATTT TGGTCATGAG ATTATCAAAA 5100
AGGATCTTCA CCTAGATCCT TTTAAATTAA AAATGAAGTT TTAAATCAAT CTAAAGTATA 5160
TATGAGTAAA CTTGGTCTGA CAGTTACCAA TGCTTAATCA GTGAGGCACC TATCTCAGCG 5220
ATCTGTCTAT TTCGTTCATC CATAGTTGCC TGACTCCCCG TCGTGTAGAT AACTACGATA 5280
CGGGAGGGCT TACCATCTGG CCCCAGTGCT GCAATGATAC CGCGAGACCC ACGCTCACCG 5340
GCTCCAGATT TATCAGCAAT AAACCAGCCA GCCGGAAGGG CCGAGCGCAG AAGTGGTCCT 5400
GCAACTTTAT CCGCCTCCAT CCAGTCTATT AATTGTTGCC GGGAAGCTAG AGTAAGTAGT 5460
TCGCCAGTTA ATAGTTTGCG CAACGTTGTT GCCATTGCTA CAGGCATCGT GGTGTCACGC 5520
TCGTCGTTTG GTATGGCTTC ATTCAGCTCC GGTTCCCAAC GATCAAGGCG AGTTACATGA 5580
TCCCCCATGT TGTGCAAAAA AGCGGTTAGC TCCTTCGGTC CTCCGATCGT TGTCAGAAGT 5640
AAGTTGGCCG CAGTGTTATC ACTCATGGTT ATGGCAGCAC TGCATAATTC TCTTACTGTC 5700
ATGCCATCCG TAAGATGCTT TTCTGTGACT GGTGAGTACT CAACCAAGTC ATTCTGAGAA 5760
TAGTGTATGC GGCGACCGAG TTGCTCTTGC CCGGCGTCAA TACGGGATAA TACCGCGCCA 5820
CATAGCAGAA CTTTAAAAGT GCTCATCATT GGAAAACGTT CTTCGGGGCG AAAACTCTCA 5880
AGGATCTTAC CGCTGTTGAG ATCCAGTTCG ATGTAACCCA CTCGTGCACC CAACTGATCT 5940
TCAGCATCTT TTACTTTCAC CAGCGTTTCT GGGTGAGCAA AAACAGGAAG GCAAAATGCC 6000
GCAAAAAAGG GAATAAGGGC GACACGGAAA TGTTGAATAC TCATACTCTT CCTTTTTCAA 6060
TATTATTGAA GCATTTATCA GGGTTATTGT CTCATGAGCG GATACATATT TGAATGTATT 6120
TAGAAAAATA AACAAATAGG GGTTCCGCGC ACATTTCCCC GAAAAGTGCC ACCTGACGTC 6180
TAAGAAACCA TTATTATCAT GACATTAACC TATAAAAATA GGCGTATCAC GAGGCCCTTT 6240
CGTC 6244
(2) INFORMATION FOR SEQ ID NO: 9:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 4963 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: circular

(ii) MOLECULE TYPE: DNA (genomic)

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Human cytomegalovirus

(vii) IMMEDIATE SOURCE:

(B) CLONE: pUHD BGR4

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:

CTCGAGTTTA CCACTCCCTA TCAGTGATAG AGAAAAGTGA AAGTCGAGTT TACCACTCCC 60
TATCAGTGAT AGAGAAAAGT GAAAGTCGAG TTTACCACTC CCTATCAGTG ATAGAGAAAA 120
GTGAAAGTCG AGTTTACCAC TCCCTATCAG TGATAGAGAA AAGTGAAAGT CGAGTTTACC 180
ACTCCCTATC AGTGATAGAG AAAAGTGAAA GTCGAGTTTA CCACTCCCTA TCAGTGATAG 240
AGAAAAGTGA AAGTCGAGTT TACCACTCCC TATCAGTGAT AGAGAAAAGT GAAAGTCGAG 300
CTCGGTACCC GGGTCGAGTA GGCGTGTACG GTGGGAGGCC TATATAAGCA GAGCTCGTTT 360
AGTGAACCGT CAGATCGCCT GGAGACGCCA TCCACGCTGT TTTGACCTCC ATAGAAGACA 420
CCGGGACCGA TCCAGCCTCC GCGGCCCCGA ATTCCGGCCA CGACCATGAC CATGACCCTC 480
CACACCAAAG CATCTGGGAT GGCCCTACTG CATCAGATCC AAGGGAACGA GCTGGAGCCC 540
CTGAACCGTC CGCAGCTCAA GATCCCCCTG GAGCGGCCCC TGGGCGAGGT GTACCTGGAC 600
AGCAGCAAGC CCGCCGTGTA CAACTACCCC GAGGGCGCCG CCTACGAGTT CAACGCCGCG 660
GCCGCCGCCA ACGCGCAGGT CTACGGTCAG ACCGGCCTCC CCTACGGCCC CGGGTCTGAG 720
GCTGCGGCGT TCGGCTCCAA CGGCCTGGGG GGTTTCCCCC CACTCAACAG CGTGTCTCCG 780
AGCCCGCTGA TGCTACTGCA CCCGCCGCCG CAGCTGTCGC CTTTCCTGCA GCCCCACGGC 840
CAGCAGGTGC CCTACTACCT GGAGAACGAG CCCAGCGGCT ACACGGTGCG CGAGGCCGGC 900
CCGCCGGCAT TCTACAGGCC AAATTCAGAT AATCGACGCC AGGGTGGCAG AGAAAGATTG 960
GCCAGTACCA ATGACAAGGG AAGTATGGCT ATGGAATCTG CCAAGGAGAC TCGCTACTGT 1020
GCAGTGTGCA ATGACTATGC TTCAGGCTAC CATTATGGAG TCTGGTCCTG TGAGGGCTGC 1080
AAGGCCTTCT TCAAGAGAAG TATTCAAGGA CATAACGACT ATATGTGTCC AGCCACCAAC 1140
CAGTGCACCA TTGATAAAAA CAGGAGGAAG AGCTGCCAGG CCTGCCGGCT CCGCAAATGC 1200
TACGAAGTGG GAATGATGAA AGGTGGGATA CGAAAAGACC GAAGAGGAGG GAGAATGTTG 1260
AAACACAAGC GCCAGAGAGA TGATGGGGAG GGCAGGGGTG AAGTGGGGTC TGCTGGAGAC 1320
ATGAGAGCTG CCAACCTTTG GCCAAGCCCG CTCATGATCA AACGCTCTAA GAAGAACAGC 1380
CTGGCCTTGT CCCTGACGGC CGACCAGATG GTCATGGCCT TGTTGGATGC TGAGCCCCCC 1440
ATACTCTATT CCGAGTATGA TCCTACCAGA CCCTTCAGTG AAGCTTCGAT GATGGGCTTA 1500
CTGACCAACC TGGCAGACAG GGAGCTGGTT CACATGATCA ACTGGGCGAA GAGGGTGCCA 1560
GGCTTTGTGG ATTTGACCCT CCATGATCAG GTCCACCTTC TAGAATGTGC CTGGCTAGAG 1620
ATCCTGATGA TTGGTCTCGT CTGGCGCTCC ATGGAGCACC CAGTGAAGCT ACTGTTTGCT 1680
CCTAACTTGC TCTTGGACAG GAACCAGGGA AAATGTGTAG AGGGCATGGT GGAGATCTTC 1740
GACATGCTGC TGGCTACATC ATCTCGGTTC CGCATGATGA ATCTGCAGGG AGAGGAGTTT 1800
GTGTGCCTCA AATCTATTAT TTTGCTTAAT TCTGGAGTGT ACACATTTCT GTCCAGCACC 1860
CTGAAGTCTC TGGAAGAGAA GGACCATATC CACCGAGTCC TGGACAAGAT CACAGACACT 1920
TTGATCCACC TGATGGCCAA GGCAGGCCTG ACCCTGCAGC AGCAGCACCA GCGGCTGGCC 1980
CAGCTCCTCC TCATCCTCTC CCACATCAGG CACATGAGTA ACAAAGGCAT GGAGCATCTG 2040
TACAGCATGA AGTGCAAGAA CGTGGTGCCC CTCTATGACC TGCTGCTGGA GATGCTGGAC 2100
GCCCACCGCC TACATGCGCC CACTAGCCGT GGAGGGGCAT CCGTGGAGGA GACGGACCAA 2160
AGCCACTTGG CCACTGCGGG CTCTACTTCA TCGCATTCCT TGCAAAAGTA TTACATCACG 2220
GGGGAGGCAG AGGGTTTCCC TGCCACAGTC TGAGAGCTCC CTGGCGGAAT TCGAGCTCGG 2280
TACCCGGGGA TCCTCTAGAG GATCCAGACA TGATAAGATA CATTGATGAG TTTGGACAAA 2340
CCACAACTAG AATGCAGTGA AAAAAATGCT TTATTTGTGA AATTTGTGAT GCTATTGCTT 2400
TATTTGTAAC CATTATAAGC TGCAATAAAC AAGTTAACAA CAACAATTGC ATTCATTTTA 2460
TGTTTCAGGT TCAGGGGGAG GTGTGGGAGG TTTTTTAAAG CAAGTAAAAC CTCTACAAAT 2520
GTGGTATGGC TGATTATGAT CCTGCAAGCC TCGTCGTCTG GCCGGACCAC GCTATCTGTG 2580
CAAGGTCCCC GGACGCGCGC TCCATGAGCA GAGCGCCCGC CGCCGAGGCA AGACTCGGGC 2640
GGCGCCCTGC CCGTCCCACC AGGTCAACAG GCGGTAACCG GCCTCTTCAT CGGGAATGCG 2700
CGCGACCTTC AGCATCGCCG GCATGTCCCC TGGCGGACGG GAAGTATCAG CTCGACCAAG 2760
CTTGGCGAGA TTTTCAGGAG CTAAGGAAGC TAAAATGGAG AAAAAAATCA CTGGATATAC 2820
CACCGTTGAT ATATCCCAAT GGCATCGTAA AGAACATTTT GAGGCATTTC AGTCAGTTGC 2880
TCAATGTACC TATAACCAGA CCGTTCAGCT GCATTAATGA ATCGGCCAAC GCGCGGGGAG 2940
AGGCGGTTTG CGTATTGGGC GCTCTTCCGC TTCCTCGCTC ACTGACTCGC TGCGCTCGGT 3000
CGTTCGGCTG CGGCGAGCGG TATCAGCTCA CTCAAAGGCG GTAATACGGT TATCCACAGA 3060
ATCAGGGGAT AACGCAGGAA AGAACATGTG AGCAAAAGGC CAGCAAAAGG CCAGGAACCG 3120
TAAAAAGGCC GCGTTGCTGG CGTTTTTCCA TAGGCTCCGC CCCCCTGACG AGCATCACAA 3180
AAATCGACGC TCAAGTCAGA GGTGGCGAAA CCCGACAGGA CTATAAAGAT ACCAGGCGTT 3240
TCCCCCTGGA AGCTCCCTCG TGCGCTCTCC TGTTCCGACC CTGCCGCTTA CCGGATACCT 3300
GTCCGCCTTT CTCCCTTCGG GAAGCGTGGC GCTTTCTCAA TGCTCACGCT GTAGGTATCT 3360
CAGTTCGGTG TAGGTCGTTC GCTCCAAGCT GGGCTGTGTG CACGAACCCC CCGTTCAGCC 3420
CGACCGCTGC GCCTTATCCG GTAACTATCG TCTTGAGTCC AACCCGGTAA GACACGACTT 3480
ATCGCCACTG GCAGCAGCCA CTGGTAACAG GATTAGCAGA GCGAGGTATG TAGGCGGTGC 3540
TACAGAGTTC TTGAAGTGGT GGCCTAACTA CGGCTACACT AGAAGGACAG TATTTGGTAT 3600
CTGCGCTCTG CTGAAGCCAG TTACCTTCGG AAAAAGAGTT GGTAGCTCTT GATCCGGCAA 3660
ACAAACCACC GCTGGTAGCG GTGGTTTTTT TGTTTGCAAG CAGCAGATTA CGCGCAGAAA 3720
AAAAGGATCT CAAGAAGATC CTTTGATCTT TTCTACGGGG TCTGACGCTC AGTGGAACGA 3780
AAACTCACGT TAAGGGATTT TGGTCATGAG ATTATCAAAA AGGATCTTCA CCTAGATCCT 3840
TTTAAATTAA AAATGAAGTT TTAAATCAAT CTAAAGTATA TATGAGTAAA CTTGGTCTGA 3900
CAGTTACCAA TGCTTAATCA GTGAGGCACC TATCTCAGCG ATCTGTCTAT TTCGTTCATC 3960
CATAGTTGCC TGATCCCCGT CGTGTAGATA ACTACGATAC GGGAGGGCTT ACCATCTGGC 4020
CCCAGTGCTG CAATGATACC GCGAGACCCA CGCTCACCGG CTCCAGATTT ATCAGCAATA 4080
AACCAGCCAG CCGGAAGGGC CGAGCGCAGA AGTGGTCCTG CAACTTTATC CGCCTCCATC 4140
CAGTCTATTA ATTGTTGCCG GGAAGCTAGA GTAAGTAGTT CGCCAGTTAA TAGTTTGCGC 4200
AACGTTGTTG CCATTGCTAC AGGCATCGTG GTGTCACGCT CGTCGTTTGG TATGGCTTCA 4260
TTCAGCTCCG GTTCCCAACG ATCAAGGCGA GTTACATGAT CCCCCATGTT GTGCAAAAAA 4320
GCGGTTAGCT CCTTCGGTCC TCCGATCGTT GTCAGAAGTA AGTTGGCCGC AGTGTTATCA 4380
CTCATGGTTA TGGCAGCACT GCATAATTCT CTTACTGTCA TGCCATCCGT AAGATGCTTT 4440
TCTGTGACTG GTGAGTACTC AACCAAGTCA TTCTGAGAAT AGTGTATGCG GCGACCGAGT 4500
TGCTCTTGCC CGGCGTCAAT ACGGGATAAT ACCGCGCCAC ATAGCAGAAC TTTAAAAGTG 4560
CTCATCATTG GAAAACGTTC TTCGGGGCGA AAACTCTCAA GGATCTTACC GCTGTTGAGA 4620
TCCAGTTCGA TGTAACCCAC TCGTGCACCC AACTGATCTT CAGCATCTTT TACTTTCACC 4680
AGCGTTTCTG GGTGAGCAAA AACAGGAAGG CAAAATGCCG CAAAAAAGGG AATAAGGGCG 4740
ACACGGAAAT GTTGAATACT CATACTCTTC CTTTTTCAAT ATTATTGAAG CATTTATCAG 4800
GGTTATTGTC TCATGAGCGG ATACATATTT GAATGTATTT AGAAAAATAA ACAAATAGGG 4860
GTTCCGCGCA CATTTCCCCG AAAAGTGCCA CCTGACGTCT AAGAAACCAT TATTATCATG 4920
ACATTAACCT ATAAAAATAG GCGTATCACG AGGCCCTTTC GTC 4963



(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 42 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:

TCGAGTTTAC CACTCCCTAT CAGTGATAGA GAAAAGTGAA AG 42



